ABSTRACT

The examination of Student Evaluation Instruments (SEI) has generated a considerable literature. 
Interestingly, this extensive literature provides no clear guidance on how to interpret SEI results 
in order to make comparative evaluations of instructors’ performances. The research presented in
this paper draws upon six semesters’ worth of SEI responses for all courses in our school of
business, a database of nearly 30,000 responses. The paper examines how two core measures of
teaching effectiveness, student evaluation of the instructor’s teaching ability and willingness to
recommend the instructor, are affected by several factors. These factors include the department
from which the course was offered; whether the course was required by the core, required by the
department, or an elective; the status of the student; and the anticipated grade. Statistical analyses are
conducted to determine the impact of these factors and their interactions. The goal is
to develop a system that can more accurately gauge instructors’ performances as measured by the
student evaluation instrument.


INTRODUCTION 

Increasingly, colleges and universities are expected to demonstrate excellence in teaching. The
need to demonstrate such excellence in teaching and learning is being driven by public demand and
accrediting agencies. Most institutions of higher learning have developed systems for measuring teaching
effectiveness. These systems use a variety of approaches, including evaluation of instructional design,
monitoring of course management, and assessment of instructional delivery. This last factor can be evaluated by means of peer
review and student evaluations. Student evaluation of instruction is generally acquired through a standardized
instrument. Taken together, these mechanisms play a critical role in academic life since they are crucial in the
promotion and tenure processes. Seldin (1993) argued that student evaluation instruments (SEI) are the dominant
factor used by administrators when they evaluate teaching effectiveness. Becker and Watts (1999) argued that
the measurement derived from SEIs contributed 50% to 60% of the overall evaluation of teaching effectiveness.
Hobson and Talbot (2001) and Richardson (2005) stated that universities use SEIs as their primary method for
evaluating teaching effectiveness. Comm and Mathaisel (1998) found that 94% of business schools responding to a
survey use SEIs as one means of evaluating instruction. This high proportion may be due to an assessment
requirement of AACSB (1994).


LITERATURE REVIEW 


Importance 

Given their importance, it is not surprising that the study of Student Evaluation Instruments has
engendered a large literature. Cashin (1995) stated that there were over 1,500 books and articles on SEIs. Wilson
(1998) reported that since their first use, there have been over 2,000 articles on this subject. Al-Issa and Sulieman
(2007) found 2,988 articles on SEIs published between 1990 and 2005. This
method of evaluating teaching performance has gained widespread use in most universities and colleges (Hobson &
Talbot, 2001; Richardson, 2005) and is often the primary method used to evaluate classroom teaching performance
(Yunker & Sterner, 1988). Comm and Mathaisel (1998) indicate that almost all AACSB-accredited business schools
responding to a survey use Student Evaluation Instruments as an element in determining teaching effectiveness.

Validity 


Much of the SEI literature examines the validity of the SEI as a tool to assess teaching effectiveness
(Clayson and Sheffet 2006; Glynn et al. 2006; Green et al. 1998; Soper 1973; Rodin and Rodin 1973; Sopher 1973;
Morgan et al. 2003). Several studies tend to support the validity of SEIs (Aleamoni, 1999; Wachtel, 1998).
However, serious questions remain as to whether SEIs should be used as a primary measure of teaching
effectiveness. There is also the question of whether students are in a position to accurately evaluate the teaching
capabilities of their professors. Some argue that they are not [Caskin (1983); Selden (1984); Newton
(1988); Bures, DeRidder and Tong (1990); and Richer (1996)].

That argument was predicated on students’ inability to distinguish between attitudes toward the instructor 
and the instructor’s actual effectiveness as a teacher. Clayson and Sheffet (2006) presented evidence of a strong 
relationship between students’ perception of the instructor’s personality and their evaluation of instructional 
effectiveness in marketing and business core courses. Aigner and Thum (1986) found that an instructor’s
enthusiasm, along with other factors, had a significantly positive influence on the instructor’s rating. Williams
and Ceci (1997) found that SEI ratings are significantly influenced by instructors’ personality factors. Clayson
(1999) presented evidence that the majority of the variance in SEI results was attributable to personality. Not all
authors [Centra (1993) and Braskamp et al. (1944)] agree with the notion that personality is a major determinant of
SEI results. 

Demographic factors have been included in many of the studies. Race and gender of instructors have also 
been investigated as possible factors. Smith and Anderson’s (2005) study of female Hispanic faculty found that they 
received much lower scores on their SEIs than their Anglo counterparts. 

Many researchers [Leslie, Kellams & Gunne (1982); Gappa (1984); and Bruno (2003)] examined the 
employment status of the instructor and found that full-time faculty members generally received higher scores than 
part-time faculty. 

Other researchers have looked at the impact of course workload on SEIs. Not surprisingly, several studies 
(Stapelton et al., 2001, and Paswan and Young, 2002) clearly indicated a negative relationship between increased 
course demands (materials, workload, and homework) and the results of student evaluations of their instructors. 
Course demands (as measured by hours per week required outside of class) were found by Aigner and Thum (1986) 
to have a significant negative impact. 

Course Type 

For the purpose of this research, it is important to review research on the role that course type plays in
evaluations. Prior research has shown a relationship between the student’s reason for taking the course
(requirement for the school core, requirement for the major, or an elective) and the student’s perception of the
professor. Elective courses are rated higher than non-elective courses (Marsh, 1987; Feldman, 1978). Required
courses outside the student’s major receive the lowest ratings (Marsh, 1987; Feldman, 1978). Boex (2000)
found that student-instructor interaction had a significantly positive impact on effectiveness ratings in core-level
courses, but not in non-core-level courses.

Methodology Criticisms 

A significant portion of the SEI literature has also been criticized for being methodologically flawed.
Among the methodological problems identified by Marsh (1987) in this type of research were implying
causation from correlation, using an inappropriate unit of analysis, and not properly accounting for the multivariate
nature of SEIs and potential biases.

All of these varied factors support the view that SEIs should not be the sole basis for evaluation. Green et
al. (1998) recommend that accounting departments should reevaluate their SEIs to remove items that students cannot 
assess. Further, they note that SEIs should be designed to capture data on course materials and curriculum 
design/course development, as well as other relevant dimensions of effective teaching. 

METHOD 

Student Evaluation Instrument 

As part of the Quinnipiac School of Business’ quest for AACSB accreditation 15 years ago, we initiated an assessment
program. Part of that program involved the development and use of a student evaluation instrument. The instrument
has 21 closed-ended and two open-ended questions (provided in Appendix 1). In order to assure anonymity, no
demographic data are collected other than the student’s status (freshman, sophomore, junior, senior or graduate student).
Data are also collected on the categorization of the course: a business core course, a major’s core course, or an
elective. We also ask about the extent to which the student is keeping up with materials for the course and the
grade the student expects. The remaining 17 closed-ended questions focus on the student’s perception of particular
aspects of the course and the instructor’s teaching ability. These questions are scored on a 5-point Likert scale.
Two items, the student’s evaluation of the instructor’s Teaching Ability and whether the student would Recommend this
instructor to a friend, are of particular importance during the evaluation process. This paper focuses on
these two items. The Teaching Ability question is coded such that the more favorable the evaluation of the instructor’s
teaching ability, the higher the score (1 = Poor to 5 = Excellent).
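
The direction of each scale matters when reading the scores. The following is a minimal sketch, assuming the response labels listed in Appendix 1; the dictionary and function names are illustrative and are not part of the instrument. It shows that Teaching Ability is coded so that a higher score is more favorable, while the agreement items (including the Recommend item) appear to be coded so that a lower score is more favorable.

```python
# Illustrative coding of the two response scales (labels taken from Appendix 1).
TEACHING_ABILITY = {"Poor": 1, "Fair": 2, "Good": 3, "Very Good": 4, "Excellent": 5}
AGREEMENT = {
    "Strongly Agree": 1, "Agree": 2, "Neither": 3,
    "Disagree": 4, "Strongly Disagree": 5,
}


def reverse_code(score: int, low: int = 1, high: int = 5) -> int:
    """Flip a Likert score so the most favorable answer gets the highest value."""
    if not low <= score <= high:
        raise ValueError(f"score {score} is outside the range {low}-{high}")
    return low + high - score


ability = TEACHING_ABILITY["Very Good"]      # higher is better
recommend = AGREEMENT["Agree"]               # lower is better
recommend_reversed = reverse_code(recommend) # optional: make higher better
print(ability, recommend, recommend_reversed)
```

Reverse-coding is shown only as an optional convenience; the tables in this paper report the Recommendation item on its original coding.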

Sample 


We examined data collected from our Student Evaluation Instrument for six semesters (three years) for the 
entire School of Business. The results yielded nearly 30,000 useable responses across all business majors. 

Our examination centered on two questions from our University’s School of Business SEI - the students’ 
evaluation of the instructor’s Teaching Ability and whether the student would Recommend this instructor to a friend. 
At our institution, when evaluating a faculty member for continued employment, promotion and tenure, these two 
items take precedence in terms of importance. 

RESULTS 

Total Sample 

Table 1 provides the number of observations, mean score, and standard deviation for the total sample by
semester for the two Likert-scale questions of highest importance in our promotion and tenure process: Teaching
Ability and Recommendation.

It should be noted that our evaluation instrument codes the courses by department; however, there are three 
sets of courses - Health Management, Business Law and Quantitative Methods - that are coded separately. We did 
not include these courses in our study since the instructors were not evaluated within the normal context of a 
departmental review. These three represent approximately 4% of the total sample. Table 3 provides percent of total 
observations for the six semesters analyzed. 

Comparison Measure 

In Tables 2 and 3, we provide the mean score for Teaching Ability and Recommendation, respectively, for
each semester by department. The last row in each table, School of Business, represents the courses given
this designation. It includes our cornerstone and capstone courses along with several one-credit courses. Faculty
members from different departments teach these courses. Scores from these courses are included in their
evaluations; therefore, we have included them in our analysis. Data were not collected for these courses during the
Fall 2002 semester.


Table 1: Teaching Ability and Recommendation Scores by Semester

                 Teaching Ability                Recommend Teacher
Semester         Count     Mean    Std. Dev.     Count     Mean    Std. Dev.
Fall 2002        4,312     3.91    1.07          4,199     1.88    1.14
Spring 2003      4,605     3.84    1.10          4,499     1.94    1.16
Fall 2003        5,348     3.88    1.08          5,224     1.93    1.16
Spring 2004      5,395     3.91    1.04          5,287     1.90    1.11
Fall 2004        5,212     3.78    1.16          4,848     1.95    1.15
Spring 2005      4,724     3.96    1.05          4,612     1.88    1.13
Overall          29,596    3.88    1.09          28,660    1.91    1.14
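
The Overall row of Table 1 can be read as a count-weighted average of the semester means. The following is a minimal sketch of that arithmetic for Teaching Ability, using the counts and means from Table 1; the function name `pooled_mean` is illustrative, and the calculation assumes the published means are rounded to two decimals.

```python
# (count, mean) pairs for Teaching Ability, copied from Table 1.
teaching_ability = [
    (4312, 3.91),  # Fall 2002
    (4605, 3.84),  # Spring 2003
    (5348, 3.88),  # Fall 2003
    (5395, 3.91),  # Spring 2004
    (5212, 3.78),  # Fall 2004
    (4724, 3.96),  # Spring 2005
]


def pooled_mean(rows):
    """Return (weighted mean, total count) for a list of (count, mean) pairs."""
    total_n = sum(n for n, _ in rows)
    weighted_sum = sum(n * m for n, m in rows)
    return weighted_sum / total_n, total_n


overall_mean, overall_n = pooled_mean(teaching_ability)
print(overall_n, round(overall_mean, 2))  # 29596 3.88
```

The result agrees with the Overall count and mean reported for Teaching Ability.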


Table 2: Mean Scores for Teaching Ability by Semester and by Department

Department            F2002   S2003   F2003   S2004   F2004   S2005
Accounting            4.00    3.94    3.74    3.89    3.90    4.11
CIS                   4.11    3.87    4.11    4.18    4.14    4.14
Economics             4.04    3.93    4.09    4.10    4.02    3.92
Finance               3.43    3.66    3.67    3.97    3.60    3.86
IB                    3.69    3.79    3.95    3.73    3.84    3.50
Management            3.83    3.66    3.63    3.96    3.86    4.02
Marketing             3.99    3.88    4.22    4.14    4.06    4.13
School of Business            3.58    3.55    3.39    2.82    3.86


Table 3: Mean Scores for Recommendation by Semester and by Department

Department            F2002   S2003   F2003   S2004   F2004   S2005
Accounting            1.76    1.73    1.96    1.89    2.00    1.74
CIS                   1.60    1.85    1.63    1.65    1.69    1.74
Economics             1.79    1.93    1.77    1.69    1.82    1.92
Finance               2.30    2.21    2.17    1.82    2.20    1.95
IB                    2.16    2.03    1.92    2.10    1.99    2.32
Management            1.86    2.02    2.08    1.78    1.90    1.84
Marketing             1.94    2.01    1.59    1.73    1.85    1.68
School of Business            2.21    2.37    2.48    2.42    2.10


A single global standard would be preferable and easier to apply, given the uniformity it would provide.
However, statistically significant differences between departments on both measures would call into question the validity
and, more importantly, the fairness of using a single global measure. The same would be true if we found differences
among the ranks of the students (freshmen, sophomores, juniors, seniors and graduate students) or the nature of the
course requirement (core requirement, requirement for major or elective). It is also critical to identify whether the
students’ anticipated grade influences the outcomes on both measures. To address the question of whether
there were significant variations, we conducted an analysis of variance using the General Linear Model (GLM)
procedure in SPSS. The results for the Teaching Ability measure are presented in Table 4, while the results for the
Recommendation measure are presented in Table 5.
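
To make the logic of the test concrete, the following is a simplified sketch of a one-way analysis of variance for a single factor. It is not the full factorial model fitted in SPSS, and the scores are hypothetical values invented for illustration, not data from this study. The F statistic is the ratio of the between-group mean square to the within-group mean square.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a dict of group -> list of scores."""
    all_scores = [x for scores in groups.values() for x in scores]
    grand_mean = mean(all_scores)
    # Variation of the group means around the grand mean, weighted by group size.
    ss_between = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in groups.values())
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - mean(s)) ** 2 for s in groups.values() for x in s)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical Teaching Ability responses (1-5); NOT data from the study.
toy_scores = {
    "Dept A": [4, 5, 4, 3, 5],
    "Dept B": [3, 3, 4, 2, 3],
    "Dept C": [4, 4, 5, 4, 3],
}
f_stat, df_b, df_w = one_way_anova(toy_scores)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

For these toy scores the sketch gives an F of about 3.65 on 2 and 12 degrees of freedom. The factorial model reported in Tables 4 and 5 applies the same ratio to each main effect and interaction.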

Table 4: Results for GLM on Teaching Ability

Effect                                        Sum of Squares   df      Mean Square   F         Significance
Corrected Model                               5780.79          722     8.01          7.96      <.001
Intercept                                     4313.87          1       4313.87       4290.13   <.001
Department                                    30.12            11      2.78          2.72      .002
Ranking                                       6.94             4       1.74          1.73      .141
Requirement                                   31.95            4       7.99          7.94      <.001
Grade                                         52.28            4       13.07         13.00     <.001
Department * Ranking                          97.94            39      2.51          2.50      <.001
Department * Requirement                      62.45            39      1.60          1.59      .011
Department * Grade                            76.46            43      1.78          1.77      .001
Ranking * Requirement                         38.59            16      2.41          2.40      .001
Ranking * Grade                               14.10            16      .88           .88       .597
Requirement * Grade                           19.06            16      1.19          1.19      .271
Department * Ranking * Requirement            148.52           107     1.38          1.38      .006
Department * Ranking * Grade                  129.81           120     1.08          1.08      .270
Department * Requirement * Grade              114.00           92      1.24          1.23      .065
Ranking * Requirement * Grade                 68.52            46      1.49          1.48      .019
Department * Ranking * Requirement * Grade    150.78           147     1.03          1.02      .417
Error                                         28788.48         28630   1.01
Total                                         476138.00        29353
Corrected Total                               34569.26         29352





Table 5: Results for GLM on Recommendation

Effect                                        Sum of Squares   df      Mean Square   F         Significance
Corrected Model                               5777.39          683     8.46          7.53      <.001
Intercept                                     1281.82          1       1281.82       1140.75   <.001
Department                                    37.10            11      3.37          3.00      .001
Ranking                                       6.06             4       1.52          1.35      .249
Requirement                                   5.54             4       1.38          1.23      .295
Grade                                         92.06            4       23.01         20.48     <.001
Department * Ranking                          80.53            39      2.07          1.84      .001
Department * Requirement                      73.25            39      1.88          1.67      .005
Department * Grade                            100.63           42      2.40          2.13      <.001
Ranking * Requirement                         22.16            16      1.39          1.23      .233
Ranking * Grade                               15.89            16      .99           .88       .589
Requirement * Grade                           24.68            16      1.54          1.37      .145
Department * Ranking * Requirement            165.95           104     1.60          1.42      .003
Department * Ranking * Grade                  156.79           131     1.39          1.24      .046
Department * Requirement * Grade              131.57           82      1.61          1.43      .007
Ranking * Requirement * Grade                 62.40            43      1.45          1.29      .095
Department * Ranking * Requirement * Grade    149.62           136     1.10          .98       .553
Error                                         31147.94         27720   1.24
Total                                         141079.00        28404
Corrected Total                               36925.33         28403





The results indicate that the Teaching Ability measure is significantly influenced by the department that
offers the course, the course’s requirement status, and the students’ anticipated grades. Taken individually,
the student ranking is not significant. The results further indicate that the Recommendation measure is
significantly influenced by the department that offers the course and by the students’ anticipated grades.
Neither the student ranking nor the nature of the course’s requirement is statistically significant as a main effect.
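
The F statistics behind these statements follow from the sums of squares and degrees of freedom in the GLM tables: each mean square is the sum of squares divided by its degrees of freedom, and each F is the effect mean square divided by the error mean square. The following is a minimal sketch of that arithmetic for two rows of Table 4; the function name `f_ratio` is illustrative.

```python
# Error row of Table 4 (Teaching Ability).
SS_ERROR, DF_ERROR = 28788.48, 28630

# effect name -> (sum of squares, degrees of freedom), copied from Table 4.
effects = {
    "Requirement": (31.95, 4),
    "Grade": (52.28, 4),
}


def f_ratio(ss_effect, df_effect, ss_error=SS_ERROR, df_error=DF_ERROR):
    """Return (mean square, F) for one effect of an ANOVA table."""
    ms_effect = ss_effect / df_effect
    ms_error = ss_error / df_error
    return ms_effect, ms_effect / ms_error


results = {name: f_ratio(ss, df) for name, (ss, df) in effects.items()}
for name, (ms, f) in results.items():
    print(f"{name}: MS = {ms:.2f}, F = {f:.2f}")
```

Both rows reproduce the mean squares and F values shown in Table 4 (7.99 and 7.94 for Requirement, 13.07 and 13.00 for Grade).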

The interaction effects also appear to be heavily influenced by the department in which the course was offered.
These results provide strong evidence that a school-wide measure would be inappropriate.

CONCLUSIONS 

Particular attention should be given to the standard by which to measure performance. Our research 
findings indicate that there are significant differences across departments. As a result, for promotion and tenure 
decisions, consideration should be given to the use of department measures in evaluations rather than a universal 
measure, such as the overall school mean. Prior studies primarily focused on the SEI itself without analysis
of the appropriate measurement standard.
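
The practical effect of the choice of standard can be illustrated with a small sketch. The benchmarks below are the Fall 2002 Teaching Ability means from Tables 1 and 2; the instructor score of 3.70 is hypothetical, and the function name `compare` is illustrative rather than part of any evaluation procedure described here.

```python
# Fall 2002 Teaching Ability benchmarks (Table 1 school mean, Table 2 department means).
SCHOOL_MEAN = 3.91
DEPARTMENT_MEANS = {"Finance": 3.43, "CIS": 4.11}


def compare(score, department):
    """Return the gap between a score and the department and school means."""
    return {
        "vs_department": round(score - DEPARTMENT_MEANS[department], 2),
        "vs_school": round(score - SCHOOL_MEAN, 2),
    }


# Hypothetical instructor score, evaluated as if in each department.
finance_instructor = compare(3.70, "Finance")
cis_instructor = compare(3.70, "CIS")
print(finance_instructor)
print(cis_instructor)
```

A hypothetical Finance instructor with a score of 3.70 sits 0.27 above the department mean but 0.21 below the school mean, so the two standards lead to opposite readings of the same score. The same score in CIS falls below both benchmarks.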

This research provides empirical data to support the use of a more appropriate standard to adequately assess 
teaching effectiveness. The measurement standard (overall school versus department) must be considered in the 
evaluation process. In this way, the SEI evaluation process can more accurately appraise teaching ability. It is safe
to say that student evaluation instruments will remain in use; however, the standard by which they are measured
may change as a result of the considerations presented in this research.

FUTURE RESEARCH 

Although this research did not explore beyond the measurement standard, the data we collected allow us to
investigate a variety of factors in future research. The existing literature appears to indicate that the requirement 
status of the course, and where it falls in the curriculum, impacts a student’s perception and resulting evaluation of 
the instructor. Since course requirement status appears to impact ratings, consideration must be given to this 
mitigating factor when evaluating a faculty member. 

With respect to the students’ anticipated grades and their evaluations of the instructor’s teaching, our future
research could confirm prior research (Nelson and Lynch 1984; Mehdizadeh 1990; Stratton et al. 1994; Isley and
Singh 2005; McPherson 2006) in which lower anticipated grades were associated with lower evaluations and higher
anticipated grades with higher teaching evaluations.

APPENDIX 1

You are a: Freshman / Sophomore / Junior / Senior / Graduate Student

Is this course: Required for Core / Required for Major / Elective

Rate the instructor’s teaching ability in this class: Poor / Fair / Good / Very Good / Excellent

How are you doing in keeping up with assignments and readings (percent complete): 0-20% / 21-40% / 41-60% / 61-80% / 81-100%

Expected Grade: A / B / C / D / F

Each of the following statements is rated on the scale Strongly Agree (1), Agree (2), Neither (3), Disagree (4), Strongly Disagree (5):

I have become more competent in this area due to this course.

I have increased my overall knowledge of the subject matter.

I feel challenged intellectually by this course.

The instructor presents the material too rapidly.

The instructor gives assignments that are too difficult.

The instructor is available to provide extra help.

The instructor provides clear answers to the student questions.

The instructor encourages class discussion.

The instructor brings current ideas to the classroom.

The instructor has the course well organized.

The instructor summarizes main points and provides emphasis on material.

The instructor relates course concepts in systematic fashion.

The instructor seems to enjoy teaching.

The instructor is friendly and considerate to students.

The instructor is enthusiastic about the course material.

I would recommend taking another course with this instructor to a friend.

